On Smoothing and Enhancing Dynamics of Pitch Contours Represented by Discrete Orthogonal Polynomials for Prosody Generation
Abstract
This paper presents a new pitch contour generation algorithm for statistical syllable-based logF0 generation models that represent the logF0 contour of each syllable by coefficients of discrete orthogonal polynomials, i.e. orthogonal expansion coefficients (OECs). Conventional statistical logF0 models can generate a smooth pitch contour within a syllable because polynomials are continuous. However, these models do not guarantee continuous and smooth logF0 contours in the proximity of syllable junctures. In addition, the dynamic range of the generated logF0 contours is generally smaller than that of real speech. These two shortcomings result in unnatural and monotonous prosody. To overcome them, juncture-smoothing and dynamics-enhancing OEC generation algorithms are proposed in this paper. Analysis of the logF0 contours generated by the proposed algorithm shows some improvement in logF0 smoothness at syllable junctures and an enhanced logF0 dynamic range. In addition, a perceptual evaluation of the logF0 contours generated by the proposed algorithm shows an improvement in the naturalness of the synthesized speech.
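The OEC representation described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the polynomial order, the exact basis, or the smoothing method, so the sketch assumes a third-order basis obtained by QR-orthonormalizing a Vandermonde matrix, and the function names and the two synthetic syllables are hypothetical. It shows only that each syllable is encoded independently, so nothing constrains the reconstructed contours to meet at the juncture.

```python
import numpy as np

def discrete_orthonormal_basis(n_frames, order=3):
    """Columns are polynomials of degree 0..order, orthonormal over
    n_frames equally spaced points (assumed basis, built via QR)."""
    t = np.linspace(0.0, 1.0, n_frames)
    vander = np.vander(t, order + 1, increasing=True)
    q, r = np.linalg.qr(vander)
    # Fix signs so every polynomial has a positive leading coefficient.
    return q * np.sign(np.diag(r))

def encode(logf0, order=3):
    """Project one syllable's logF0 frames onto the basis to get its OECs."""
    return discrete_orthonormal_basis(len(logf0), order).T @ logf0

def decode(oecs, n_frames):
    """Rebuild a logF0 contour of n_frames frames from its OECs."""
    return discrete_orthonormal_basis(n_frames, len(oecs) - 1) @ oecs

# Two hypothetical adjacent syllables (F0 in Hz, converted to logF0).
syl_a = np.log(np.linspace(200.0, 180.0, 20))
syl_b = np.log(np.linspace(150.0, 170.0, 25))

rec_a = decode(encode(syl_a), len(syl_a))
rec_b = decode(encode(syl_b), len(syl_b))

# Each syllable is smooth internally, but the contour jumps at the juncture.
juncture_jump = abs(rec_b[0] - rec_a[-1])
print(f"logF0 jump at the syllable juncture: {juncture_jump:.3f}")
```

A juncture-smoothing generator of the kind the abstract proposes would have to choose the OECs of neighbouring syllables jointly so that this jump shrinks; how the paper does so is not stated here.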
Similar resources
The Function of Pitch Range Variations in Samples of Emotional Expressions in Persian
This study investigates the interface between emotion and intonation patterns (more specifically, duration and pitch amplitude of speech). To this end, the acoustic properties of spectral parameters related to speech prosody are investigated. The results of acoustic and statistical analysis show that mean level and range of F0 in the contours vary strongly as a function of the degree o...
Recurrences and explicit formulae for the expansion and connection coefficients in series of the product of two classical discrete orthogonal polynomials
Suppose that for an arbitrary function $f(x,y)$ of two discrete variables, we have the formal expansions $$f(x,y)=\sum\limits_{m,n=0}^{\infty}a_{m,n}\,P_{m}(x)P_{n}(y),$$ $$x^{m}P_{j}(x)=\sum\limits_{n=0}^{2m}a_{m,\,n}(j)\,P_{j+m-n}(x),$$ we find the coefficients $b_{i,j}^{(p,q,\ell,\,r)}$ in the expansion $$x^{\ell}y^{r}\,\nabla_{x}^{p}\nabla_{y}^{q}\,f(x,y)=x^{\ell}y^{r}f^{(p,q)}(x,y)=\sum\limits...
The implementation of phrasal prosody by native and non-native speakers of English: SS ANOVA for multi-syllabic intonation contours
Non-native speakers often have difficulties with prosody; stress and intonation patterns that differ from those of native speakers can contribute to “foreign accent,” even at high proficiency levels. Although effects of a listener’s native language on the perception of prosody are well established, few studies have examined non-native prosody production. In particular, it is not known whether t...
Characterization of Emotions Using the Dynamics of Prosodic Features
In this paper, the dynamics of prosodic parameters are explored for recognizing emotions from speech. The dynamics of prosodic parameters refer to local or fine variations in prosodic parameters with respect to time. The proposed dynamic features of prosody are represented by: (1) sequence of durations of syllables in the utterance (duration contour), (2) sequence of fundamental frequency v...
Publication date: 2016